Overview

Dataset statistics

Number of variables: 16
Number of observations: 7907
Missing cells: 5518
Missing cells (%): 4.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 988.5 KiB
Average record size in memory: 128.0 B

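The percentage and per-record figures in the dataset statistics follow from the raw counts. The short check below is only a sketch using the numbers quoted above; it assumes that "Missing cells (%)" is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes.

```python
# Figures copied from the dataset statistics above.
n_variables = 16
n_observations = 7907
missing_cells = 5518
total_kib = 988.5

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values the report shows (4.4% and 128.0 B).
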
Variable types

NUM: 10
CAT: 6

Warnings

name has a high cardinality: 7457 distinct values (High cardinality)
host_name has a high cardinality: 1833 distinct values (High cardinality)
last_review has a high cardinality: 1001 distinct values (High cardinality)
neighbourhood is highly correlated with neighbourhood_group (High correlation)
neighbourhood_group is highly correlated with neighbourhood (High correlation)
last_review has 2758 (34.9%) missing values (Missing)
reviews_per_month has 2758 (34.9%) missing values (Missing)
name is uniformly distributed (Uniform)
id has unique values (Unique)
number_of_reviews has 2758 (34.9%) zeros (Zeros)
availability_365 has 1386 (17.5%) zeros (Zeros)

Reproduction

Analysis started: 2020-10-14 15:03:48.629469
Analysis finished: 2020-10-14 15:07:19.217779
Duration: 3 minutes and 30.59 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

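A report like this one can in principle be regenerated with the software version listed above. The following is a minimal sketch rather than the configuration actually used: the function name `build_report`, the file names in the usage note, and the title are hypothetical, and the settings stored in config.yaml are not reproduced here.

```python
def build_report(csv_path, html_path, title="Profiling Report"):
    """Profile a CSV file and write the result as an HTML report."""
    # Imported lazily so that defining the function does not require
    # pandas or pandas-profiling to be installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title=title)
    profile.to_file(html_path)
    return profile
```

With a hypothetical input file, the call would look like `build_report("listings.csv", "report.html")`.
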
Variables

id
Real number (ℝ≥0)

UNIQUE

Distinct: 7907
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23388624.63
Minimum: 49091
Maximum: 38112762
Zeros: 0
Zeros (%): 0.0%
Memory size: 61.8 KiB

Quantile statistics

Minimum: 49091
5-th percentile: 5219982.7
Q1: 15821800.5
Median: 24706270
Q3: 32348500
95-th percentile: 37030477.3
Maximum: 38112762
Range: 38063671
Interquartile range (IQR): 16526699.5

Descriptive statistics

Standard deviation: 10164162.07
Coefficient of variation (CV): 0.4345771599
Kurtosis: -0.9351301902
Mean: 23388624.63
Median Absolute Deviation (MAD): 8146858
Skewness: -0.4293241662
Sum: 1.849338549e+11
Variance: 1.033101905e+14
Monotonicity: Strictly increasing

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
22890495 | 1 | < 0.1%
18886111 | 1 | < 0.1%
50646 | 1 | < 0.1%
30721506 | 1 | < 0.1%
16743909 | 1 | < 0.1%
23604711 | 1 | < 0.1%
19797480 | 1 | < 0.1%
24211284 | 1 | < 0.1%
5975052 | 1 | < 0.1%
21671407 | 1 | < 0.1%
Other values (7897) | 7897 | 99.9%

Minimum 5 values

Value | Count | Frequency (%)
49091 | 1 | < 0.1%
50646 | 1 | < 0.1%
56334 | 1 | < 0.1%
71609 | 1 | < 0.1%
71896 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
38112762 | 1 | < 0.1%
38110493 | 1 | < 0.1%
38109336 | 1 | < 0.1%
38108273 | 1 | < 0.1%
38105126 | 1 | < 0.1%

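Several of the statistics listed for id are simple functions of the others: the range is maximum minus minimum (38112762 − 49091 = 38063671), the IQR is Q3 − Q1 (32348500 − 15821800.5 = 16526699.5), and the CV is the standard deviation divided by the mean (10164162.07 / 23388624.63 ≈ 0.4346). The sketch below computes the same quantities for a small list using only the Python standard library. The helper name `summarize` is hypothetical, and the choice of the sample standard deviation and of linear-interpolation quantiles is an assumption about how the report computes them.

```python
import statistics


def summarize(values):
    """Return a few of the report's numeric statistics for a list of numbers."""
    # "inclusive" interpolates linearly between data points (assumed to
    # match the report's quantile definition).
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1), an assumption
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "cv": std / mean,
        # Median of the absolute deviations from the median.
        "mad": statistics.median([abs(v - median) for v in values]),
        # True when every value is larger than the one before it.
        "strictly_increasing": all(a < b for a, b in zip(values, values[1:])),
    }


print(summarize([1, 2, 3, 4, 5]))
```
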
name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 7457
Distinct (%): 94.3%
Missing: 2
Missing (%): < 0.1%
Memory size: 61.8 KiB

Common values

Value | Count | Frequency (%)
Luxury hostel with in-cabin locker - Single mixed | 13 | 0.2%
Studio Apartment - Oakwood Premier | 9 | 0.1%
Inviting & Cozy 1BR APT 3 mins from Tg Pagar MRT | 9 | 0.1%
Tasteful & Cozy 1 BR near SGH/Tiong Bahru | 8 | 0.1%
Superhost 1BR APT in the heart of Tg Pagar | 8 | 0.1%
Stylish 1BR Located 7 mins from Tg Pagar MRT | 8 | 0.1%
City-located 1BR loft apartment *BRAND NEW* | 8 | 0.1%
Furnished, Homely 2BR APT near Bouna Vista MRT | 7 | 0.1%
City-located studio loft apartment *BRAND NEW* | 7 | 0.1%
Single Capsule For 1 (Free Breakfast) | 7 | 0.1%
Other values (7447) | 7821 | 98.9%

Frequencies of value counts

Unique

Unique: 7192
Unique (%): 91.0%

Histogram of lengths of the category

Length

Max length: 99
Median length: 40
Mean length: 37.99822942
Min length: 1

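The report distinguishes "Distinct" (how many different values occur) from "Unique" (how many values occur exactly once). For name, the quoted percentages are consistent with both being taken over the 7905 non-missing entries: 7457 / 7905 ≈ 94.3% and 7192 / 7905 ≈ 91.0%. A minimal sketch of the two counts follows; the helper name `distinct_and_unique` and the toy data are hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Count distinct values and values that occur exactly once."""
    # Missing entries (None here) are ignored.
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


names = ["Loft A", "Loft A", "Studio B", "Room C", None]
print(distinct_and_unique(names))  # (3, 2)
```
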
host_id
Real number (ℝ≥0)

Distinct: 2705
Distinct (%): 34.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 91144807.41
Minimum: 23666
Maximum: 288567551
Zeros: 0
Zeros (%): 0.0%
Memory size: 61.8 KiB

Quantile statistics

Minimum: 23666
5-th percentile: 3356540.3
Q1: 23058075
Median: 63448912
Q3: 155381142
95-th percentile: 248196938
Maximum: 288567551
Range: 288543885
Interquartile range (IQR): 132323067

Descriptive statistics

Standard deviation: 81909095.31
Coefficient of variation (CV): 0.8986699038
Kurtosis: -0.7718327787
Mean: 91144807.41
Median Absolute Deviation (MAD): 52147950
Skewness: 0.7373540654
Sum: 7.206819922e+11
Variance: 6.709099894e+15
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
66406177 | 274 | 3.5%
8492007 | 203 | 2.6%
209913841 | 157 | 2.0%
29420853 | 141 | 1.8%
31464513 | 114 | 1.4%
219550151 | 113 | 1.4%
2413412 | 112 | 1.4%
108773366 | 109 | 1.4%
23722617 | 84 | 1.1%
8948251 | 83 | 1.0%
Other values (2695) | 6517 | 82.4%

Minimum 5 values

Value | Count | Frequency (%)
23666 | 1 | < 0.1%
59498 | 3 | < 0.1%
165209 | 2 | < 0.1%
184596 | 1 | < 0.1%
227796 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
288567551 | 1 | < 0.1%
288546201 | 1 | < 0.1%
288249975 | 1 | < 0.1%
288110467 | 1 | < 0.1%
288016519 | 2 | < 0.1%

host_name
Categorical

HIGH CARDINALITY

Distinct: 1833
Distinct (%): 23.2%
Missing: 0
Missing (%): 0.0%
Memory size: 61.8 KiB

Common values

Value | Count | Frequency (%)
Jay | 290 | 3.7%
Alvin | 249 | 3.1%
Richards | 157 | 2.0%
Aaron | 145 | 1.8%
Rain | 115 | 1.5%
Darcy | 114 | 1.4%
Kaurus | 112 | 1.4%
RedDoorz | 109 | 1.4%
Alex | 105 | 1.3%
Joey | 94 | 1.2%
Other values (1823) | 6417 | 81.2%

Frequencies of value counts

Unique

Unique: 1092
Unique (%): 13.8%

Histogram of lengths of the category

Length

Max length: 35
Median length: 5
Mean length: 5.890603263
Min length: 1

neighbourhood_group
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 61.8 KiB

Common values

Value | Count | Frequency (%)
Central Region | 6309 | 79.8%
West Region | 540 | 6.8%
East Region | 508 | 6.4%
North-East Region | 346 | 4.4%
North Region | 204 | 2.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 17
Median length: 14
Mean length: 13.68205388
Min length: 11

neighbourhood
Categorical

HIGH CORRELATION

Distinct: 43
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 61.8 KiB

Common values

Value | Count | Frequency (%)
Kallang | 1043 | 13.2%
Geylang | 994 | 12.6%
Novena | 537 | 6.8%
Rochor | 536 | 6.8%
Outram | 477 | 6.0%
Bukit Merah | 470 | 5.9%
Downtown Core | 428 | 5.4%
Bedok | 373 | 4.7%
River Valley | 362 | 4.6%
Queenstown | 266 | 3.4%
Other values (33) | 2421 | 30.6%

Frequencies of value counts

Unique

Unique: 3
Unique (%): < 0.1%

Histogram of lengths of the category

Length

Max length: 23
Median length: 7
Mean length: 8.419501707
Min length: 4

latitude
Real number (ℝ≥0)

Distinct: 4885
Distinct (%): 61.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.314192465
Minimum: 1.24387
Maximum: 1.45459
Zeros: 0
Zeros (%): 0.0%
Memory size: 61.8 KiB

Quantile statistics

Minimum: 1.24387
5-th percentile: 1.277909
Q1: 1.295795
Median: 1.31103
Q3: 1.32211
95-th percentile: 1.377771
Maximum: 1.45459
Range: 0.21072
Interquartile range (IQR): 0.026315

Descriptive statistics

Standard deviation: 0.03057744427
Coefficient of variation (CV): 0.02326709754
Kurtosis: 4.139245158
Mean: 1.314192465
Median Absolute Deviation (MAD): 0.0133
Skewness: 1.722879931
Sum: 10391.31982
Variance: 0.0009349800983
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1.31141 | 9 | 0.1%
1.31125 | 8 | 0.1%
1.31137 | 8 | 0.1%
1.31403 | 7 | 0.1%
1.28376 | 7 | 0.1%
1.31163 | 7 | 0.1%
1.31102 | 6 | 0.1%
1.31565 | 6 | 0.1%
1.31244 | 6 | 0.1%
1.31523 | 6 | 0.1%
Other values (4875) | 7837 | 99.1%

Minimum 5 values

Value | Count | Frequency (%)
1.24387 | 1 | < 0.1%
1.24391 | 1 | < 0.1%
1.24526 | 1 | < 0.1%
1.24627 | 1 | < 0.1%
1.24847 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
1.45459 | 1 | < 0.1%
1.45328 | 1 | < 0.1%
1.45301 | 1 | < 0.1%
1.45265 | 1 | < 0.1%
1.45203 | 1 | < 0.1%

longitude
Real number (ℝ≥0)

Distinct: 5414
Distinct (%): 68.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 103.8487875
Minimum: 103.64656
Maximum: 103.97342
Zeros: 0
Zeros (%): 0.0%
Memory size: 61.8 KiB

Quantile statistics

Minimum: 103.64656
5-th percentile: 103.759509
Q1: 103.835825
Median: 103.84941
Q3: 103.872535
95-th percentile: 103.912734
Maximum: 103.97342
Range: 0.32686
Interquartile range (IQR): 0.03671

Descriptive statistics

Standard deviation: 0.04367464259
Coefficient of variation (CV): 0.0004205599666
Kurtosis: 1.970240678
Mean: 103.8487875
Median Absolute Deviation (MAD): 0.01537
Skewness: -0.738700154
Sum: 821132.3624
Variance: 0.001907474405
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
103.86022 | 7 | 0.1%
103.85361 | 7 | 0.1%
103.84294 | 7 | 0.1%
103.84523 | 6 | 0.1%
103.83863 | 6 | 0.1%
103.85201 | 6 | 0.1%
103.84667 | 6 | 0.1%
103.84528 | 6 | 0.1%
103.84014 | 6 | 0.1%
103.85192 | 6 | 0.1%
Other values (5404) | 7844 | 99.2%

Minimum 5 values

Value | Count | Frequency (%)
103.64656 | 1 | < 0.1%
103.66547 | 1 | < 0.1%
103.68162 | 1 | < 0.1%
103.6852 | 1 | < 0.1%
103.68536 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
103.97342 | 1 | < 0.1%
103.97292 | 1 | < 0.1%
103.97171 | 1 | < 0.1%
103.97158 | 1 | < 0.1%
103.97105 | 1 | < 0.1%

room_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 61.8 KiB

Common values

Value | Count | Frequency (%)
Entire home/apt | 4132 | 52.3%
Private room | 3381 | 42.8%
Shared room | 394 | 5.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 15
Median length: 15
Mean length: 13.51789554
Min length: 11

price
Real number (ℝ≥0)

Distinct: 374
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 169.3329961
Minimum: 0
Maximum: 10000
Zeros: 1
Zeros (%): < 0.1%
Memory size: 61.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 35
Q1: 65
Median: 124
Q3: 199
95-th percentile: 381
Maximum: 10000
Range: 10000
Interquartile range (IQR): 134

Descriptive statistics

Standard deviation: 340.1875991
Coefficient of variation (CV): 2.008985886
Kurtosis: 464.4327957
Mean: 169.3329961
Median Absolute Deviation (MAD): 64
Skewness: 19.09278291
Sum: 1338916
Variance: 115727.6026
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
60 | 218 | 2.8%
50 | 209 | 2.6%
100 | 189 | 2.4%
150 | 174 | 2.2%
131 | 171 | 2.2%
69 | 170 | 2.1%
200 | 166 | 2.1%
119 | 152 | 1.9%
56 | 152 | 1.9%
81 | 146 | 1.8%
Other values (364) | 6160 | 77.9%

Minimum 5 values

Value | Count | Frequency (%)
0 | 1 | < 0.1%
14 | 4 | 0.1%
15 | 5 | 0.1%
18 | 4 | 0.1%
19 | 28 | 0.4%

Maximum 5 values

Value | Count | Frequency (%)
10000 | 3 | < 0.1%
8900 | 2 | < 0.1%
7000 | 2 | < 0.1%
6944 | 1 | < 0.1%
6000 | 1 | < 0.1%

minimum_nights
Real number (ℝ≥0)

Distinct: 73
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.51005438
Minimum: 1
Maximum: 1000
Zeros: 0
Zeros (%): 0.0%
Memory size: 61.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 3
Q3: 10
95-th percentile: 90
Maximum: 1000
Range: 999
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 42.09461647
Coefficient of variation (CV): 2.404025456
Kurtosis: 69.89985985
Mean: 17.51005438
Median Absolute Deviation (MAD): 2
Skewness: 6.102892196
Sum: 138452
Variance: 1771.956735
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 2089 | 26.4%
2 | 1388 | 17.6%
3 | 1133 | 14.3%
90 | 514 | 6.5%
7 | 424 | 5.4%
30 | 407 | 5.1%
5 | 401 | 5.1%
4 | 214 | 2.7%
6 | 194 | 2.5%
18 | 181 | 2.3%
Other values (63) | 962 | 12.2%

Minimum 5 values

Value | Count | Frequency (%)
1 | 2089 | 26.4%
2 | 1388 | 17.6%
3 | 1133 | 14.3%
4 | 214 | 2.7%
5 | 401 | 5.1%

Maximum 5 values

Value | Count | Frequency (%)
1000 | 1 | < 0.1%
700 | 1 | < 0.1%
500 | 1 | < 0.1%
365 | 30 | 0.4%
360 | 3 | < 0.1%

number_of_reviews
Real number (ℝ≥0)

ZEROS

Distinct: 208
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.80738586
Minimum: 0
Maximum: 323
Zeros: 2758
Zeros (%): 34.9%
Memory size: 61.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 2
Q3: 10
95-th percentile: 66
Maximum: 323
Range: 323
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 29.70774597
Coefficient of variation (CV): 2.319579209
Kurtosis: 25.41366328
Mean: 12.80738586
Median Absolute Deviation (MAD): 2
Skewness: 4.42551345
Sum: 101268
Variance: 882.5501706
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 2758 | 34.9%
1 | 1084 | 13.7%
2 | 592 | 7.5%
3 | 373 | 4.7%
4 | 258 | 3.3%
5 | 218 | 2.8%
6 | 187 | 2.4%
7 | 142 | 1.8%
8 | 130 | 1.6%
9 | 117 | 1.5%
Other values (198) | 2048 | 25.9%

Minimum 5 values

Value | Count | Frequency (%)
0 | 2758 | 34.9%
1 | 1084 | 13.7%
2 | 592 | 7.5%
3 | 373 | 4.7%
4 | 258 | 3.3%

Maximum 5 values

Value | Count | Frequency (%)
323 | 1 | < 0.1%
307 | 1 | < 0.1%
296 | 2 | < 0.1%
291 | 1 | < 0.1%
289 | 1 | < 0.1%

last_review
Categorical

HIGH CARDINALITY
MISSING

Distinct: 1001
Distinct (%): 19.4%
Missing: 2758
Missing (%): 34.9%
Memory size: 61.8 KiB

Common values

Value | Count | Frequency (%)
2019-08-12 | 152 | 1.9%
2019-08-11 | 128 | 1.6%
2019-08-13 | 110 | 1.4%
2019-08-10 | 87 | 1.1%
2019-08-08 | 78 | 1.0%
2019-08-04 | 74 | 0.9%
2019-08-05 | 66 | 0.8%
2019-08-25 | 64 | 0.8%
2019-07-29 | 62 | 0.8%
2019-07-31 | 62 | 0.8%
Other values (991) | 4266 | 54.0%
(Missing) | 2758 | 34.9%

Frequencies of value counts

Unique

Unique: 423
Unique (%): 8.2%

Histogram of lengths of the category

Length

Max length: 10
Median length: 10
Mean length: 7.558366005
Min length: 3

reviews_per_month
Real number (ℝ≥0)

MISSING

Distinct: 527
Distinct (%): 10.2%
Missing: 2758
Missing (%): 34.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.043668674
Minimum: 0.01
Maximum: 13
Zeros: 0
Zeros (%): 0.0%
Memory size: 61.8 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 0.05
Q1: 0.18
Median: 0.55
Q3: 1.37
95-th percentile: 3.8
Maximum: 13
Range: 12.99
Interquartile range (IQR): 1.19

Descriptive statistics

Standard deviation: 1.285851237
Coefficient of variation (CV): 1.232049279
Kurtosis: 8.021118779
Mean: 1.043668674
Median Absolute Deviation (MAD): 0.44
Skewness: 2.330718695
Sum: 5373.85
Variance: 1.653413403
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 172 | 2.2%
0.04 | 104 | 1.3%
0.08 | 96 | 1.2%
0.05 | 93 | 1.2%
0.1 | 92 | 1.2%
0.12 | 92 | 1.2%
0.06 | 91 | 1.2%
0.15 | 75 | 0.9%
0.16 | 75 | 0.9%
0.14 | 74 | 0.9%
Other values (517) | 4185 | 52.9%
(Missing) | 2758 | 34.9%

Minimum 5 values

Value | Count | Frequency (%)
0.01 | 3 | < 0.1%
0.02 | 61 | 0.8%
0.03 | 72 | 0.9%
0.04 | 104 | 1.3%
0.05 | 93 | 1.2%

Maximum 5 values

Value | Count | Frequency (%)
13 | 1 | < 0.1%
12.6 | 1 | < 0.1%
12 | 1 | < 0.1%
11.03 | 1 | < 0.1%
8.37 | 1 | < 0.1%

calculated_host_listings_count
Real number (ℝ≥0)

Distinct: 55
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.60768939
Minimum: 1
Maximum: 274
Zeros: 0
Zeros (%): 0.0%
Memory size: 61.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 9
Q3: 48
95-th percentile: 203
Maximum: 274
Range: 273
Interquartile range (IQR): 46

Descriptive statistics

Standard deviation: 65.13525309
Coefficient of variation (CV): 1.604012789
Kurtosis: 4.166080549
Mean: 40.60768939
Median Absolute Deviation (MAD): 8
Skewness: 2.149585925
Sum: 321085
Variance: 4242.601196
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1965 | 24.9%
2 | 644 | 8.1%
3 | 339 | 4.3%
274 | 274 | 3.5%
4 | 216 | 2.7%
203 | 203 | 2.6%
67 | 201 | 2.5%
6 | 192 | 2.4%
7 | 189 | 2.4%
8 | 184 | 2.3%
Other values (45) | 3500 | 44.3%

Minimum 5 values

Value | Count | Frequency (%)
1 | 1965 | 24.9%
2 | 644 | 8.1%
3 | 339 | 4.3%
4 | 216 | 2.7%
5 | 175 | 2.2%

Maximum 5 values

Value | Count | Frequency (%)
274 | 274 | 3.5%
203 | 203 | 2.6%
157 | 157 | 2.0%
141 | 141 | 1.8%
114 | 114 | 1.4%

availability_365
Real number (ℝ≥0)

ZEROS

Distinct: 359
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 208.7263185
Minimum: 0
Maximum: 365
Zeros: 1386
Zeros (%): 17.5%
Memory size: 61.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 54
Median: 260
Q3: 355
95-th percentile: 365
Maximum: 365
Range: 365
Interquartile range (IQR): 301

Descriptive statistics

Standard deviation: 146.1200345
Coefficient of variation (CV): 0.7000556306
Kurtosis: -1.602890685
Mean: 208.7263185
Median Absolute Deviation (MAD): 104
Skewness: -0.3055947725
Sum: 1650399
Variance: 21351.06448
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 1386 | 17.5%
365 | 843 | 10.7%
364 | 336 | 4.2%
362 | 150 | 1.9%
358 | 131 | 1.7%
359 | 117 | 1.5%
363 | 116 | 1.5%
361 | 85 | 1.1%
356 | 80 | 1.0%
360 | 80 | 1.0%
Other values (349) | 4583 | 58.0%

Minimum 5 values

Value | Count | Frequency (%)
0 | 1386 | 17.5%
1 | 15 | 0.2%
2 | 19 | 0.2%
3 | 16 | 0.2%
4 | 11 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
365 | 843 | 10.7%
364 | 336 | 4.2%
363 | 116 | 1.5%
362 | 150 | 1.9%
361 | 85 | 1.1%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
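
The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the report itself runs; the function name `pearson_r` is hypothetical, and it assumes neither input is constant (otherwise the denominator is zero).

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and
    # denominator, so plain sums of products are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4]
print(pearson_r(xs, [2, 4, 6, 8]))              # perfectly linear, increasing
print(pearson_r(xs, [8, 6, 4, 2]))              # perfectly linear, decreasing
print(pearson_r(xs, [3 * x + 5 for x in xs]))   # shift and scale leave r unchanged
```
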

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
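
As a sketch of the rank-based calculation described above: convert each variable to ranks, then apply the Pearson formula to the ranks. The helper names are hypothetical, and giving tied values the average of their positions is an assumption (it is a common convention, but the report does not state which one it uses).

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]        # monotonic but not linear
print(pearson_r(xs, ys))         # below 1
print(spearman_rho(xs, ys))      # 1 up to floating-point rounding
```
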

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
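
Read literally, the description above is (concordant pairs − discordant pairs) / (total number of pairs), often called tau-a. The sketch below implements exactly that with a hypothetical function name; library implementations frequently use a tie-adjusted variant instead, so values on data with many ties may differ from what this simple version returns.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # sign == 0 is a tie in x or y and counts as neither
    n = len(xs)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


print(kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4]))  # all pairs concordant
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))        # 2 concordant, 1 discordant
```
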

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.

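For illustration, a bias-corrected Cramér's V in the form usually attributed to Bergsma (2013) can be sketched with the standard library alone. The function name is hypothetical, and this is not necessarily line-for-line what the report software does: the χ² statistic is computed from the contingency table, φ² = χ²/n is shrunk by (k−1)(r−1)/(n−1) and floored at zero, and the numbers of rows r and columns k are corrected in the same spirit.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))  # missing cells count as 0

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table, e.g. a single category
    return math.sqrt(phi2_corr / denom)


perfect_x = ["a"] * 10 + ["b"] * 10
perfect_y = ["x"] * 10 + ["y"] * 10
print(cramers_v_corrected(perfect_x, perfect_y))                     # perfect association
print(cramers_v_corrected(["a", "a", "b", "b"], ["x", "y", "x", "y"]))  # independence
```

A pair such as neighbourhood and neighbourhood_group, where one variable largely determines the other, is the kind of case in which a measure like this comes out close to 1.
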
Missing values


Sample

First rows

(index) | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365
0 | 49091 | COZICOMFORT LONG TERM STAY ROOM 2 | 266763 | Francesca | North Region | Woodlands | 1.44255 | 103.79580 | Private room | 83 | 180 | 1 | 2013-10-21 | 0.01 | 2 | 365
1 | 50646 | Pleasant Room along Bukit Timah | 227796 | Sujatha | Central Region | Bukit Timah | 1.33235 | 103.78521 | Private room | 81 | 90 | 18 | 2014-12-26 | 0.28 | 1 | 365
2 | 56334 | COZICOMFORT | 266763 | Francesca | North Region | Woodlands | 1.44246 | 103.79667 | Private room | 69 | 6 | 20 | 2015-10-01 | 0.20 | 2 | 365
3 | 71609 | Ensuite Room (Room 1 & 2) near EXPO | 367042 | Belinda | East Region | Tampines | 1.34541 | 103.95712 | Private room | 206 | 1 | 14 | 2019-08-11 | 0.15 | 9 | 353
4 | 71896 | B&B Room 1 near Airport & EXPO | 367042 | Belinda | East Region | Tampines | 1.34567 | 103.95963 | Private room | 94 | 1 | 22 | 2019-07-28 | 0.22 | 9 | 355
5 | 71903 | Room 2-near Airport & EXPO | 367042 | Belinda | East Region | Tampines | 1.34702 | 103.96103 | Private room | 104 | 1 | 39 | 2019-08-15 | 0.38 | 9 | 346
6 | 71907 | 3rd level Jumbo room 5 near EXPO | 367042 | Belinda | East Region | Tampines | 1.34348 | 103.96337 | Private room | 208 | 1 | 25 | 2019-07-25 | 0.25 | 9 | 172
7 | 241503 | Long stay at The Breezy East "Leopard" | 1017645 | Bianca | East Region | Bedok | 1.32304 | 103.91363 | Private room | 50 | 90 | 174 | 2019-05-31 | 1.88 | 4 | 59
8 | 241508 | Long stay at The Breezy East "Plumeria" | 1017645 | Bianca | East Region | Bedok | 1.32458 | 103.91163 | Private room | 54 | 90 | 198 | 2019-04-28 | 2.08 | 4 | 133
9 | 241510 | Long stay at The Breezy East "Red Palm" | 1017645 | Bianca | East Region | Bedok | 1.32461 | 103.91191 | Private room | 42 | 90 | 236 | 2019-07-31 | 2.53 | 4 | 147

Last rows

idnamehost_idhost_nameneighbourhood_groupneighbourhoodlatitudelongituderoom_typepriceminimum_nightsnumber_of_reviewslast_reviewreviews_per_monthcalculated_host_listings_countavailability_365
789738092051New Small Room @Orchard/Somerset/Central Area262337792HermanCentral RegionRiver Valley1.29482103.83809Private room4090NaNNaN11258
789838092142Well connected 2 bedroom 2 bathroom apartment !223622603TanyaCentral RegionRochor1.30082103.84956Entire home/apt20020NaNNaN4347
789938094671SMALL ROOM FOR ONE @SOMERSET/ORCHARD/CENTRAL AREA262337792HermanCentral RegionRiver Valley1.29369103.83768Private room3370NaNNaN11359
790038102097环境优雅的公寓286260560BoWest RegionBukit Batok1.35654103.76028Private room9030NaNNaN183
7901381049712 PAX LOFT Close To Kent Ridge Park278109833BelleCentral RegionQueenstown1.27973103.78751Entire home/apt10030NaNNaN3161
790238105126Loft 2 pax near Haw Par / Pasir Panjang. Free Wifi278109833BelleCentral RegionQueenstown1.27973103.78751Entire home/apt10030NaNNaN3161
7903381082733bedroom luxury at Orchard238891646NehaCentral RegionTanglin1.29269103.82623Entire home/apt55060NaNNaN34365
790438109336[ Farrer Park ] New City Fringe CBD Mins to MRT281448565MindyCentral RegionKallang1.31286103.85996Private room58300NaNNaN3173
790538110493Cheap Master Room in Central of Singapore243835202HuangCentral RegionRiver Valley1.29543103.83801Private room56140NaNNaN230
790638112762Amazing room with private bathroom walk to Orchard28788520TerenceCentral RegionRiver Valley1.29672103.83325Private room65900NaNNaN7365